HMM-based MAP Prediction o Formant Frequencies from N

نویسنده

  • Jonathan Darch
چکیده

This paper describes how formant frequencies of voiced and unvoiced speech can be predicted from mel-frequency cepstral coefficients (MFCC) vectors using maximum a posteriori (MAP) estimation within a hidden Markov model (HMM) framework. Gaussian mixture models (GMMs) are used to model the local joint density of MFCCs and formant frequencies. More localised prediction is achieved by modelling speech using voiced, unvoiced and nonspeech GMMs for every state of each model of a set of HMMs. To predict formant frequencies from a MFCC vector, first a prediction of the speech class (voiced, unvoiced or non-speech) is made. Formant frequencies are predicted from voiced and unvoiced speech using a MAP estimation made using the state-specific GMMs. This ‘HMM-GMM’ prediction of speech class and formant frequencies was evaluated on a male 5000 word unconstrained large vocabulary speaker-independent database.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Predicting Formant Frequencies from MFCC Vectors

This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCCs and formant frequencies using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method predict...

متن کامل

Formant frequency prediction from MFCC vectors in noisy environments

This paper proposes a method of predicting the formant frequencies of a frame of speech from its mel-frequency cepstral coefficient (MFCC) representation. Prediction is achieved through the creation of a Gaussian mixture model (GMM) which models the joint density of formant frequencies and MFCCs. Using this GMM and an input MFCC vector, a maximum a posteriori (MAP) prediction of the formant fre...

متن کامل

Formant Prediction from MFCC Vectors

This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCC vectors and formant vectors using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method pred...

متن کامل

Modelling and ranking of differences across formants of british, australian and american accents

The differences between formants of British, Australian and American English accents are analysed and ranked. An improved formant model based on linear prediction (LP) feature analysis and a two-dimensional(2D) hidden Markov model (HMM) of formants is employed for estimation of the formant frequencies and bandwidths of vowels and diphthongs. Comparative analysis of the formant trajectories, the...

متن کامل

Formant analysis and synthesis using hidden Markov models

This paper describes a unifying framework for both formant tracking and speech synthesis using Hidden Markov Models (HMM). The feature vector in the HMM is composed by the first three formant frequencies, their bandwidths and their delta with time. Speech is synthesized by generating the most likely sequence of feature vectors from a HMM, trained with a set of sentences from a given speaker. Hi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006